70 research outputs found
Detecting Semantic Parts on Partially Occluded Objects
In this paper, we address the task of detecting semantic parts on partially
occluded objects. We consider a scenario where the model is trained on
non-occluded images but tested on occluded ones. The motivation is that there
are infinitely many occlusion patterns in the real world, which cannot be fully
covered by the training data, so models should be inherently robust and
adaptive to occlusion rather than fitting the occlusion patterns present in
the training data. Our approach detects semantic parts by accumulating the
confidence of local visual cues. Specifically, the method uses a simple voting
scheme, based on log-likelihood ratio tests and spatial constraints, to combine
the evidence of local cues. These cues, called visual concepts, are derived by
clustering the internal states of deep networks. We evaluate our voting scheme
on the VehicleSemanticPart dataset, which provides dense part annotations. To
generate test images with various occlusions, we randomly place two, three, or
four irrelevant objects onto the target object. Experiments show that our
algorithm outperforms several competitors in semantic part detection when
occlusions are present.
Comment: Accepted to BMVC 2017 (13 pages, 3 figures)
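The voting step can be illustrated with a small sketch (function names, shapes, and the non-negative score maps are assumptions for illustration, not the paper's exact formulation): each visual concept casts a vote for the part center at a learned spatial offset, and because votes are log-likelihood ratios, summing them combines the local evidence; a part can still score highly when occlusion removes some of its cues.

```python
import numpy as np

def vote_part_centers(cue_maps, offsets, shape):
    """Accumulate log-likelihood-ratio votes for part centers.

    cue_maps: {cue_id: HxW array of LLR scores at each location}
    offsets:  {cue_id: (dy, dx) expected displacement from cue to part center}
    shape:    (H, W) of the score grid
    Returns an HxW map whose peaks are candidate part centers.
    """
    H, W = shape
    score = np.zeros((H, W))
    for cue, llr in cue_maps.items():
        dy, dx = offsets[cue]
        for y in range(H):
            for x in range(W):
                py, px = y + dy, x + dx  # location this cue votes for
                if 0 <= py < H and 0 <= px < W:
                    score[py, px] += llr[y, x]
    return score
```

Because the combination is additive, losing a cue to an occluder lowers the part score but does not zero it out, which is the robustness the abstract refers to.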
DeepVoting: A Robust and Explainable Deep Network for Semantic Part Detection under Partial Occlusion
In this paper, we study the task of detecting semantic parts of an object,
e.g., a wheel of a car, under partial occlusion. We propose that all models
should be trained without seeing occlusions while being able to transfer the
learned knowledge to deal with them. This setting alleviates the difficulty of
collecting an exponentially large dataset to cover all occlusion patterns, and
it better reflects real-world conditions. In this scenario, proposal-based deep
networks, such as the R-CNN series, often produce unsatisfactory results,
because both the proposal extraction and classification stages may be confused
by irrelevant occluders. To address this, [25] proposed a voting mechanism that
combines multiple local visual cues to detect semantic parts. The semantic
parts can still be detected even when some visual cues are missing due to
occlusion. However, this method is manually designed and therefore hard to
optimize in an end-to-end manner.
In this paper, we present DeepVoting, which incorporates the robustness shown
by [25] into a deep network, so that the whole pipeline can be jointly
optimized. Specifically, it adds two layers on top of the intermediate features
of a deep network, e.g., the pool-4 layer of VGGNet. The first layer extracts
the evidence of local visual cues, and the second layer performs voting by
exploiting the spatial relationship between visual cues and semantic parts.
We also propose an improved version, DeepVoting+, which learns visual cues
from context outside the objects. In experiments, DeepVoting achieves
significantly better performance than several baseline methods, including
Faster-RCNN, for semantic part detection under occlusion. In addition,
DeepVoting enjoys explainability: the detection results can be diagnosed by
looking up the voting cues.
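A minimal numpy sketch of such a two-layer head, under assumed shapes (the function name, shapes, and ReLU placement are illustrative, not the authors' implementation): the first layer acts like a 1x1 convolution producing cue-evidence maps from the intermediate features, and the second is a spatial convolution whose kernels encode where each cue tends to fire relative to each part.

```python
import numpy as np

def deep_voting_head(features, cue_templates, vote_kernels):
    """Sketch of a DeepVoting-style head.

    features:      (C, H, W) intermediate CNN features (e.g. pool-4).
    cue_templates: (K, C)    1x1-conv weights mapping features to K
                             visual-cue evidence maps.
    vote_kernels:  (P, K, S, S) spatial voting kernels, S odd; kernel
                             (p, k) encodes where cue k appears relative
                             to semantic part p.
    Returns (P, H, W) part score maps.
    """
    # Layer 1: 1x1 convolution + ReLU -> cue evidence maps, shape (K, H, W)
    cues = np.maximum(
        np.tensordot(cue_templates, features, axes=([1], [0])), 0.0)
    # Layer 2: spatial voting as a K x S x S convolution per part
    P, K, S, _ = vote_kernels.shape
    _, H, W = cues.shape
    pad = S // 2
    padded = np.pad(cues, ((0, 0), (pad, pad), (pad, pad)))
    scores = np.zeros((P, H, W))
    for p in range(P):
        for y in range(H):
            for x in range(W):
                patch = padded[:, y:y + S, x:x + S]
                scores[p, y, x] = np.sum(vote_kernels[p] * patch)
    return scores
```

Since both layers are (convolutional) linear maps plus a pointwise nonlinearity, the whole pipeline is differentiable and can be trained end-to-end, which is the advantage over the hand-designed voting of [25].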
Robust Deep Learning Frameworks for Recognizing and Localizing Objects Accurately and Reliably
Detection is an important task in computer vision. It requires recognizing targets inside images and localizing them. The images can be 2D or 3D and can be represented by dense pixels or sparse point clouds. With the recent emergence and development of deep neural networks, many deep-learning-based detection frameworks have been proposed. They provide promising performance for many targets, e.g., natural objects, object parts, pedestrians, and faces, and are therefore widely used in many applications, including surveillance, autonomous driving, and medical image analysis. However, robust object detection is still challenging. Ideal detectors should be able to handle objects with unknown occluders, different scales and movements, long-tailed difficult objects, and low-contrast radiology inputs. Recent detectors are not designed with deliberate consideration of these challenges and may suffer degraded performance. In this dissertation, we investigate these challenges and propose novel detection frameworks to mitigate them.
The aforementioned challenges are addressed in different aspects. (i) We address occlusion by proposing end-to-end voting mechanisms for vehicle part detection, which detect targets by accumulating cues relevant to the target. Occlusion eliminates some of the cues, but the remaining cues can still detect the targets. (ii) We combine semantic segmentation with object detection to enrich the detection features in multi-layer single-stage detectors. The enriched features capture both low-level details and high-level semantics, so detection quality improves significantly for both small and large objects. (iii) We investigate the issue of long-tailed hard examples and propose a hard-image-mining strategy that dynamically identifies hard images and devotes more effort to them during training, leading to models robust to long-tailed hard examples. (iv) For low-contrast multi-slice medical images, we design hybrid detectors that combine 2D and 3D information: on top of a stack of 2D CNNs, one per image slice, we design 3D fusion modules to bridge context information across the 2D CNNs. (v) For objects moving in sequences, we design temporal region proposals to model their movements and interactions, representing moving objects with spatial-temporal-interactive features for detection across past, current, and future frames.
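The hard-image-mining idea in (iii) can be sketched as follows (a hypothetical minimal version; the class name, exponential-moving-average update, and loss-proportional sampling rule are assumptions, not the dissertation's exact algorithm): keep a running loss per image and sample images for the next batches with probability proportional to that loss, so persistently hard images receive more training effort.

```python
import random

class HardImageMiner:
    """Track a running loss per image; sample hard images more often."""

    def __init__(self, n_images, smooth=0.5):
        # Optimistic initialization: treat every image as hard at first.
        self.loss = [1.0] * n_images
        self.smooth = smooth

    def update(self, idx, loss_value):
        # Exponential moving average of the observed training loss.
        self.loss[idx] = (self.smooth * self.loss[idx]
                          + (1.0 - self.smooth) * loss_value)

    def sample(self, k, rng=random):
        # Pick indices with probability proportional to their running loss,
        # so images the model still struggles with are revisited more often.
        return rng.choices(range(len(self.loss)), weights=self.loss, k=k)
```

Easy images see their weight decay toward their (small) loss and are sampled less, which concentrates training on the long tail without discarding any data.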
Joint Inversion of Production and Temperature Data Illuminates Vertical Permeability Distribution in Deep Reservoirs
Characterization of connectivity in compartmentalized deepwater Gulf of Mexico (GoM) reservoirs is an outstanding industry challenge that can significantly impact development planning and recovery from these assets. In these deep formations, the temperature gradient can be quite significant, and temperature data can provide valuable information about field connectivity, vertical fluid displacement, and the permeability distribution in the vertical direction. In this paper, we examine the importance of temperature data by integrating production and temperature data both jointly and individually, and we conclude that including temperature data in the history matching of deep GoM reservoirs can increase the resolution of the reservoir permeability map in the vertical direction.
To illustrate the importance of temperature measurements, we use a coupled heat and fluid flow transport model to predict heat and fluid transport in the reservoir. Using this model, we ran a series of data integration studies: 1) integration of production data alone, 2) integration of temperature data alone, and 3) joint integration of production and temperature data. For data integration, we applied four algorithms: Maximum A Posteriori (MAP), Randomized Maximum Likelihood (RML), Sparsity-Regularized Reconstruction, and Sparsity-Regularized RML. The RML and Sparsity-Regularized RML approaches were used because they allow for uncertainty quantification and estimation of reservoir heterogeneity at higher resolution. We also investigated the sensitivity of temperature and production data to the permeability distribution, which showed that while the production data primarily resolved the horizontal distribution of permeability, the temperature data displayed little sensitivity to permeability in the horizontal extent of the reservoir. The results of these experiments were compelling: they clearly illuminated the role of temperature data in enhancing the resolution of reservoir permeability maps with depth. We present several experiments that clearly illustrate and support the conclusions of this study.
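For a linear-Gaussian toy problem, the RML algorithm used above has a compact sketch (the linear operator G here is an illustrative stand-in for the coupled heat and fluid flow simulator; all names and shapes are assumptions): each sample perturbs the observed data and the prior mean with draws from their respective covariances, then solves the regularized least-squares problem, and the resulting ensemble approximates the posterior, which is what enables uncertainty quantification.

```python
import numpy as np

def rml_samples(G, d_obs, Cd, m_pr, Cm, n_samples, rng):
    """Randomized Maximum Likelihood for a linear forward model d = G m.

    G:     forward operator (n_data x n_model)
    d_obs: observed data, Cd: data-error covariance
    m_pr:  prior mean model, Cm: prior model covariance
    Returns an (n_samples x n_model) ensemble of posterior samples.
    """
    Cd_inv = np.linalg.inv(Cd)
    Cm_inv = np.linalg.inv(Cm)
    # Normal-equations matrix of the regularized least-squares objective.
    A = G.T @ Cd_inv @ G + Cm_inv
    samples = []
    for _ in range(n_samples):
        # Perturb both the data and the prior mean, then re-minimize.
        d_j = rng.multivariate_normal(d_obs, Cd)
        m_j = rng.multivariate_normal(m_pr, Cm)
        b = G.T @ Cd_inv @ d_j + Cm_inv @ m_j
        samples.append(np.linalg.solve(A, b))
    return np.array(samples)
```

In the linear-Gaussian case this ensemble samples the posterior exactly; with a nonlinear reservoir simulator each solve becomes an iterative history-matching run, but the perturb-then-minimize structure is the same.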
- …